MATTEL DATA TEST

Summary

Question 1): In each neighborhood of Ames, what is the median sale price for homes sold in 2006 that have an indoor square footage greater than or equal to 2,000 sq. ft. (excluding porches, garages, decks, and veneers)?
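
A minimal sketch of how this filter-and-group step could be computed. The column names (Neighborhood, YrSold, GrLivArea, SalePrice) are assumed to follow the usual Ames housing layout, and using GrLivArea alone as "indoor square footage" is an assumption; adding finished basement area would be a reasonable alternative. The rows are made up and only show the input shape.

```python
from collections import defaultdict
from statistics import median

def median_price_by_neighborhood(rows, year=2006, min_sqft=2000):
    """Median SalePrice per Neighborhood for homes sold in `year`
    with at least `min_sqft` of living area (GrLivArea, assumed)."""
    prices = defaultdict(list)
    for row in rows:
        if row["YrSold"] == year and row["GrLivArea"] >= min_sqft:
            prices[row["Neighborhood"]].append(row["SalePrice"])
    # Sort by neighborhood name so the output order is stable.
    return {hood: median(vals) for hood, vals in sorted(prices.items())}

# Made-up rows, not taken from the dataset.
sample = [
    {"Neighborhood": "NAmes", "YrSold": 2006, "GrLivArea": 2100, "SalePrice": 200000},
    {"Neighborhood": "NAmes", "YrSold": 2006, "GrLivArea": 2400, "SalePrice": 240000},
    {"Neighborhood": "NAmes", "YrSold": 2007, "GrLivArea": 2500, "SalePrice": 300000},
    {"Neighborhood": "Gilbert", "YrSold": 2006, "GrLivArea": 1800, "SalePrice": 180000},
    {"Neighborhood": "Gilbert", "YrSold": 2006, "GrLivArea": 2000, "SalePrice": 210000},
]
result = median_price_by_neighborhood(sample)
```

Neighborhoods with no qualifying sale in 2006 simply do not appear in the result, which is worth stating alongside the table.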

Question 2): A client approaches you with a question about the local housing market. They’re interested in whether more homes are sold at certain times of year than others. In other words, is there seasonality?
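
One simple way to look at seasonality is to count sales per calendar month and compare the counts against a flat baseline. The sketch below assumes a month-of-sale column (called MoSold here, an assumption) and uses a plain chi-square statistic against equal monthly sales. It is an illustration, not the analysis actually performed, and the month values are made up.

```python
from collections import Counter

def sales_by_month(months):
    """Number of sales in each calendar month, January through December."""
    counts = Counter(months)
    return [counts.get(m, 0) for m in range(1, 13)]

def chi_square_uniform(counts):
    """Chi-square statistic against the null of equal sales in every month."""
    expected = sum(counts) / len(counts)
    return sum((c - expected) ** 2 / expected for c in counts)

# Made-up MoSold values, not taken from the dataset.
mo_sold = [6, 6, 7, 6, 1, 7, 5, 6, 7, 12, 5, 6]
counts = sales_by_month(mo_sold)
stat = chi_square_uniform(counts)
```

The statistic would be compared against a chi-square distribution with 11 degrees of freedom. A bar chart of `counts` by month is usually the clearer deliverable for a client.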

Question 3): You’re a contractor consulting for a client who wants to remodel their home and then sell it on the market. The home the client occupies is a 3 bedroom / 2 bathroom 1500 sq. ft. house. They’re deciding between the following options for the remodel:

  1. Adding a new bedroom measuring 130 sq. ft.
  2. Adding a new half bathroom measuring 80 sq. ft.
  3. Expanding the living room by 400 sq. ft.

Assume that the cost of all three remodel options is equal. Based on the data provided, which option do you think will provide the greatest predicted increase in home value and why? What other information would you seek out that might help you make the decision? Please provide any visualizations, tables, etc. to support your findings.

Predictions were made on generated data using the Lasso regression model. Because this is a linear model with no interaction terms, each of the three options produces an additive change in the predicted log(sale price).
I assumed that the new additions would be above grade and would not change the overall quality of the house. However, additional information, such as whether the addition is above grade or in the basement and whether it would add a fireplace or heating, would improve prediction accuracy.
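
To make the "linear change in log(sale price)" point concrete: with no interaction terms, the change in predicted log price is the sum of each coefficient times the change in its feature, and exp of that sum minus 1 is the relative change in price. The coefficients and feature names below are HYPOTHETICAL placeholders, not the fitted Lasso values, so the resulting ranking is illustrative only.

```python
import math

# HYPOTHETICAL coefficients on log(SalePrice); NOT the fitted Lasso values.
coef = {"GrLivArea": 0.0003, "BedroomAbvGr": 0.01, "HalfBath": 0.02}

# Feature changes implied by each remodel option (all assumed above grade).
options = {
    "bedroom": {"BedroomAbvGr": 1, "GrLivArea": 130},
    "half bath": {"HalfBath": 1, "GrLivArea": 80},
    "living room": {"GrLivArea": 400},
}

def pct_change(deltas, coef):
    """Relative change in predicted price for a set of feature changes."""
    delta_log = sum(coef[name] * d for name, d in deltas.items())
    return math.exp(delta_log) - 1.0

uplift = {name: pct_change(d, coef) for name, d in options.items()}
best = max(uplift, key=uplift.get)
```

Because the model is additive in log price, the comparison does not depend on the starting 3 bedroom / 2 bathroom 1500 sq. ft. configuration; only the feature changes matter.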

Question 4): You own a single-family home (i.e. BldgType = “1Fam”) with 4 bedrooms that you are looking to rent out or sell. Assume you can generate a yearly rent that is 10% of the estimated sales price. Your options include:

  1. Convert the home into a duplex and rent both units.
  2. Rent the home as is.
  3. Sell the home for market value.

Assume that cost is negligible for our purposes. Which option maximizes revenue received in 5 years? 10 years? 15 years? List out all assumptions you are making in your calculations and outline your thought process.

There are 178 data points from the sample dataset that are single-family housing with 4 bedrooms.
Calculating the yearly rent is straightforward: it is 10% of the estimated sale price.
To estimate the duplex conversion, I divided most of the numerical variables by 2 and then predicted the sale price using the Lasso model developed earlier. In general, categorical variables and condition-related variables were kept as is.
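
The comparison across horizons can be sketched as below. It assumes no discounting, no price appreciation, no vacancy, and that the sale is a one-time payment with no residual property value counted for the rental options. It also assumes both duplex units rent at 10% of the predicted price of one half-size unit. The two prices in the example are made up, not model output.

```python
def cumulative_revenue(sale_price, half_unit_price, years, rent_rate=0.10):
    """Total revenue after `years` for each option, under the stated assumptions."""
    return {
        "sell": sale_price,                                 # one-time payment
        "rent as is": rent_rate * sale_price * years,       # yearly rent on whole home
        "duplex": 2 * rent_rate * half_unit_price * years,  # two half-size units
    }

# Made-up prices for illustration only.
revenue = {y: cumulative_revenue(200_000, 120_000, y) for y in (5, 10, 15)}
best_option = {y: max(r, key=r.get) for y, r in revenue.items()}
```

Under these assumptions, renting as is matches selling at exactly 10 years, and the duplex beats renting as is whenever the predicted half-unit price exceeds half of the whole-home price.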